On-demand language model interpolation for mobile speech input

نویسندگان

Brandon Ballinger

Cyril Allauzen

Alexander Gruenstein

Johan Schalkwyk

چکیده

Google offers several speech features on the Android mobile operating system: search by voice, voice input to any text field, and an API for application developers. As a result, our speech recognition service must support a wide range of usage scenarios and speaking styles: relatively short search queries, addresses, business names, dictated SMS and e-mail messages, and a long tail of spoken input to any of the applications users may install. We present a method of on-demand language model interpolation in which contextual information about each utterance determines interpolation weights among a number of n-gram language models. On-demand interpolation results in an 11.2% relative reduction in WER compared to using a single language model to handle all traffic.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Maximum Entropy Language Model Adaptation for Mobile Speech Input

This paper describes unsupervised adaptation of language model for many related target domains. In mobile speech input, subject and vocabulary of the language depend highly on the usage context. We use automatically transcribed speech data to select a subset from the language model training data for building a maximum entropy model adapted to speech input. This model is further adapted for most...

متن کامل

Bayesian Language Model Interpolation for Mobile Speech Input

This paper explores various static interpolation methods for approximating a single dynamically-interpolated language model used for a variety of recognition tasks on the Google Android platform. The goal is to find the statically-interpolated firstpass LM that best reduces search errors in a two-pass system or that even allows eliminating the more complex dynamic second pass entirely. Static i...

متن کامل

Comparing the Impact of Audio-Visual Input Enhancement on Collocation Learning in Traditional and Mobile Learning Contexts

: This study investigated the impact of audio-visual input enhancement teaching techniques on improving English as Foreign Language (EFL) learnersˈ collocation learning as well as their accuracy concerning collocation use in narrative writing. In addition, it compared the impact and efficiency of audio-visual input enhancement in two learning contexts, namely traditional and mo...

متن کامل

Combining language models in the input interface of a spoken dialogue system

This paper presents a new technique to enhance the performance of the input interface of spoken dialogue systems based on a procedure that combines during speech recognition the advantages of using prompt-dependent language models with those of using a language model independent of the prompts generated by the dialogue system. The technique proposes to create a new speech recognizer, termed con...

متن کامل

Waveform Interpolation Speech Coder at 4 kb/s

Speech coding at bit rates near 4 kbps is expected to be widely deployed in applications such as visual telephony, mobile and personal communications. This research focuses on developing a speech coder based on the waveform interpolation (WI) scheme, with an attempt to deliver near toll-quality speech at rates around 4 kbps. A WI coder has been simulated in floating-point using the C programmin...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2010

On-demand language model interpolation for mobile speech input

نویسندگان

چکیده

منابع مشابه

Maximum Entropy Language Model Adaptation for Mobile Speech Input

Bayesian Language Model Interpolation for Mobile Speech Input

Comparing the Impact of Audio-Visual Input Enhancement on Collocation Learning in Traditional and Mobile Learning Contexts

Combining language models in the input interface of a spoken dialogue system

Waveform Interpolation Speech Coder at 4 kb/s

عنوان ژورنال:

اشتراک گذاری